Detection of vowel onset points in continuous speech using autoassociative neural network models
Abstract
Detection of vowel onset points (VOPs) is important for spotting subword units in continuous speech. For consonant-vowel (CV) utterances, the VOP is the instant at which the consonant part ends and the vowel part begins. Accurate detection of VOPs is important for recognition of CV units in continuous speech. In this paper, we propose an approach for detection of VOPs using autoassociative neural network (AANN) models. A pair of AANN models is trained for each CV class to capture the characteristics of the speech signal in the consonant and vowel regions of that class. The trained AANN models are then used to detect VOPs in continuous speech. The results of the studies show that the proposed approach leads to significantly fewer spurious hypotheses.
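The idea of pairing a consonant-region model with a vowel-region model can be illustrated with a minimal sketch. The abstract does not give the network architecture, the features, or the scoring rule, so everything below is an assumption: a linear autoassociative model (PCA) stands in for a trained AANN, the frames are synthetic feature vectors, and the names `fit_linear_autoassociator`, `reconstruction_error`, `hypothesize_vop`, and `half_window` are hypothetical. The sketch hypothesizes a VOP at the frame where the preceding frames are reconstructed well by the consonant model and the following frames by the vowel model.

```python
import numpy as np


def fit_linear_autoassociator(frames, n_components):
    """Fit a linear autoassociative model (PCA), used here as a stand-in for an AANN."""
    mean = frames.mean(axis=0)
    # Rows of vt are the principal directions of the centered training frames.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(model, frames):
    """Squared error between each frame and its reconstruction by the model."""
    mean, basis = model
    centered = frames - mean
    reconstructed = centered @ basis.T @ basis
    return np.sum((centered - reconstructed) ** 2, axis=1)


def hypothesize_vop(consonant_model, vowel_model, frames, half_window=5):
    """Return the frame index with a consonant-like left context and a
    vowel-like right context (assumed scoring rule, not the paper's)."""
    err_c = reconstruction_error(consonant_model, frames)
    err_v = reconstruction_error(vowel_model, frames)
    best_t, best_cost = None, np.inf
    for t in range(half_window, len(frames) - half_window + 1):
        cost = err_c[t - half_window:t].mean() + err_v[t:t + half_window].mean()
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t


def make_frames(rng, n_frames, active_axes, offset_axis, dim=8):
    """Synthetic frames: variation on two axes, a fixed offset on a third, small noise."""
    frames = 0.01 * rng.standard_normal((n_frames, dim))
    frames[:, active_axes] += rng.standard_normal((n_frames, len(active_axes)))
    frames[:, offset_axis] += 3.0
    return frames


rng = np.random.default_rng(0)

# Training data for one hypothetical CV class: consonant-region and vowel-region frames.
consonant_model = fit_linear_autoassociator(make_frames(rng, 200, [0, 1], 4), 2)
vowel_model = fit_linear_autoassociator(make_frames(rng, 200, [2, 3], 5), 2)

# Test utterance: 20 consonant-like frames followed by 20 vowel-like frames.
utterance = np.vstack([
    make_frames(rng, 20, [0, 1], 4),
    make_frames(rng, 20, [2, 3], 5),
])
vop = hypothesize_vop(consonant_model, vowel_model, utterance)
print("hypothesized VOP frame:", vop)
```

On real speech, the frames would be spectral feature vectors and each model would be a nonlinear network trained on labeled consonant or vowel regions. The synthetic data here only shows how the two reconstruction errors combine into a single hypothesis.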
Similar resources
Spotting Multilingual Consonant-Vowel Units of Speech Using Neural Network Models
A multilingual speech recognition system is required for tasks that use several languages in one speech recognition application. In this paper, we propose an approach for multilingual speech recognition by spotting consonant-vowel (CV) units. The important features of the spotting approach are that there is no need for automatic segmentation of speech and it is not necessary to use models for higher ...
Excitation Source Features for Improving the Detection of Vowel Onset and Offset Points in a Speech Sequence
The task of detecting the vowel regions in a given speech signal is a challenging problem. Over the years, several works on accurate detection of vowel regions and the corresponding vowel onset points (VOPs) and vowel end points (VEPs) have been reported. A novel front-end feature extraction technique exploiting the temporal and spectral characteristics of the excitation source information in t...
Speaker change detection in casual conversations using excitation source features
In this paper we propose a method for speaker change detection using features of the excitation source of the speech production mechanism. The method uses neural network models to capture the speaker-specific information from a signal that represents predominantly the excitation source. The focus in this paper is on speaker change detection in casual telephone conversations, in which short (<5 s) s...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Unsupervised Speaker Segmentation using Autoassociative Neural Network
In this paper we propose an unsupervised approach to speaker segmentation using an autoassociative neural network (AANN). Speaker segmentation aims at finding speaker change points in a speech signal, which is an important preprocessing step for audio indexing, spoken document retrieval and multi-speaker diarization. The method extracts the speaker specific information from the Mel frequency cepstra...